Speaker Identification Based On Discriminative Vector Quantization And Data Fusion

Author: Zhou Guangyu
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2005

Speaker Identification (SI) approaches based on discriminative Vector Quantization (VQ) and data fusion techniques are presented in this dissertation. The SI approaches based on Discriminative VQ (DVQ) proposed in this dissertation are the DVQ for SI (DVQSI), the DVQSI with Unique speech feature vector space segmentation for each speaker pair (DVQSI-U), and the Adaptive DVQSI (ADVQSI) methods. The difference between the probability distributions of the speech feature vector sets from various speakers (or speaker groups) is called the interspeaker variation between speakers (or speaker groups). The interspeaker variation is the measure of template differences between speakers (or speaker groups). All DVQ-based techniques presented in this contribution take advantage of the interspeaker variation, which is not exploited by previously proposed techniques that employ traditional VQ for SI (VQSI). All DVQ-based techniques have two modes: the training mode and the testing mode. In the training mode, the speech feature vector space is first divided into a number of subspaces based on the interspeaker variations. Then, a discriminative weight is calculated for each subspace of each speaker or speaker pair in the SI group based on the interspeaker variation. Subspaces with higher interspeaker variations are assigned larger discriminative weights, so they play more important roles in SI than those with lower interspeaker variations. In the testing mode, discriminative weighted average VQ distortions instead of equally weighted average VQ distortions are used to make the SI decision. The DVQ-based techniques lead to higher SI accuracies than VQSI. The DVQSI and DVQSI-U techniques consider the interspeaker variation for each speaker pair in the SI group. In DVQSI, the speech feature vector space segmentations for all the speaker pairs are exactly the same. In DVQSI-U, however, each speaker pair is treated individually in the speech feature vector space segmentation.
In both DVQSI and DVQSI-U, the discriminative weights for each speaker pair are calculated by trial and error. The SI accuracies of DVQSI-U are higher than those of DVQSI at the price of a much higher computational burden. ADVQSI explores the interspeaker variation between each speaker and all speakers in the SI group. In contrast with DVQSI and DVQSI-U, in ADVQSI the feature vector space segmentation is performed for each speaker instead of each speaker pair, based on the interspeaker variation between each speaker and all the speakers in the SI group. Also, adaptive techniques are used to compute the discriminative weights for each speaker in ADVQSI. The SI accuracies of ADVQSI and DVQSI-U are comparable. However, the computational complexity of ADVQSI is much lower than that of DVQSI-U. In addition, a novel algorithm to convert the raw distortion outputs of template-based SI classifiers into compatible probability measures is proposed in this dissertation. After this conversion, data fusion techniques at the measurement level can be applied to SI. In the proposed technique, stochastic models of the distortion outputs are estimated. Then, the posterior probabilities of the unknown utterance belonging to each speaker are calculated. Compatible probability measures are assigned based on the posterior probabilities. The proposed technique leads to better SI performance at the measurement level than existing approaches.
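
The testing-mode decision described in this abstract can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the two toy codebooks, the `subspace_of` split on the first coordinate, the weight values, and the use of one shared weight table for all speakers are assumptions made only to show how discriminative weighting changes the average VQ distortion.

```python
def nearest_distortion(vec, codebook):
    """Squared Euclidean distance from vec to its closest codeword."""
    return min(sum((a - b) ** 2 for a, b in zip(vec, cw)) for cw in codebook)

def weighted_avg_distortion(vectors, codebook, subspace_of, weights):
    """Average VQ distortion where each vector is weighted by its subspace."""
    num = 0.0
    den = 0.0
    for v in vectors:
        w = weights[subspace_of(v)]
        num += w * nearest_distortion(v, codebook)
        den += w
    return num / den

def identify(vectors, codebooks, subspace_of, weights):
    """Return the speaker with the smallest weighted distortion, plus all scores."""
    scores = {
        spk: weighted_avg_distortion(vectors, cb, subspace_of, weights)
        for spk, cb in codebooks.items()
    }
    return min(scores, key=scores.get), scores

# Hypothetical toy setup: two speakers whose codebooks agree in the
# left half-plane (low interspeaker variation) and differ in the right.
codebooks = {
    "A": [(-1.0, 0.0), (1.0, 0.0)],
    "B": [(-1.0, 0.0), (1.0, 1.0)],
}
subspace_of = lambda v: 0 if v[0] < 0 else 1  # assumed segmentation

test_vectors = [(-1.0, 0.3), (-1.0, -0.3), (1.0, 0.2)]  # utterance from "A"

equal_weights = {0: 1.0, 1: 1.0}          # traditional VQSI behaviour
discriminative_weights = {0: 0.2, 1: 1.0}  # assumed, favours subspace 1

winner_eq, scores_eq = identify(test_vectors, codebooks, subspace_of, equal_weights)
winner_dw, scores_dw = identify(test_vectors, codebooks, subspace_of, discriminative_weights)

# Ratio of wrong-speaker to true-speaker distortion (larger = clearer decision).
margin_eq = scores_eq["B"] / scores_eq["A"]
margin_dw = scores_dw["B"] / scores_dw["A"]
```

In this toy case both weightings pick speaker "A", but down-weighting the subspace where the speakers are indistinguishable widens the gap between the two scores, which is the intuition the abstract gives for the accuracy gain.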

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Anomaly detection: sparse representation for high dimensional data

Author: Zhou Guangyu
Publication venue: Electrical and Electronic Engineering, Imperial College London
Publication date: 01/12/2016

In this thesis, I investigate three different anomaly-aware sparse representation approaches. The first approach focuses on algorithmic development for the low-rank matrix completion problem. It has been shown that in the l0-search for low-rank matrix completion, the singular points in the objective function are the major reasons for failures. While different methods have been proposed to handle singular points, rigorous analysis has shown that there is a need for further improvement. To address the singularity issue, we propose a new objective function that is continuous everywhere. The new objective function is a good approximation of the original objective function in the sense that, in the limit, the lower level sets of the new objective function are the closure of those of the original objective function. We formulate the matrix completion problem as the minimization of the new objective function and design a quasi-Newton method to solve it. Simulations demonstrate that the new method achieves excellent numerical performance. The second part discusses dictionary learning algorithms to solve the blind source separation (BSS) problem. As a proof of concept, the focus is on the scenario where the number of mixtures is not less than that of sources. Based on the assumption that the sources are sparsely represented by some dictionaries, we present a joint source separation and dictionary learning algorithm (SparseBSS) to separate the noise-corrupted mixed sources with very little extra information. We also discuss the singularity issue in the dictionary learning process, which is one major reason for algorithm failure. Finally, two approaches are presented to address the singularity issue. The last approach focuses on algorithmic approaches to solve the robust face recognition problem where the test face image can be corrupted by arbitrary sparse noise.
The standard approach is to formulate the problem as a sparse recovery problem and solve it using l1-minimization. As an alternative, the approximate message passing (AMP) algorithm had been tested but gave pessimistic results. The contribution of this part is to successfully solve the robust face recognition problem using the AMP framework. The recently developed adaptive damping technique has been adopted to address the issue that AMP normally only works well with Gaussian matrices. Statistical models are designed to capture the nature of the signal more authentically. The expectation maximization (EM) method is used to learn the unknown hyper-parameters of the statistical model in an online fashion. Simulations demonstrate that our method achieves better recognition performance than the already impressive benchmark l1-minimization, is robust to the initial values of hyper-parameters, and exhibits low computational cost. Open Access

Spiral - Imperial College Digital Repository

Prediction of microbial communities for urban metagenomics using neural network approach.

Author: Jiang Jyun-Yu
Ju Chelsea J-T
Wang Wei
Zhou Guangyu
Publication venue: eScholarship, University of California
Publication date: 01/10/2019

BACKGROUND: Microbes are greatly associated with human health and disease, especially in densely populated cities. It is essential to understand the microbial ecosystem in an urban environment for cities to monitor the transmission of infectious diseases and detect potentially urgent threats. To achieve this goal, DNA sample collection and analysis have been conducted at subway stations in major cities. However, city-scale sampling with fine-grained geo-spatial resolution is expensive and laborious. In this paper, we introduce MetaMLAnn, a neural network based approach to infer microbial communities at unsampled locations given information reflecting different factors, including subway line networks, sampling material types, and microbial composition patterns. RESULTS: We evaluate the effectiveness of MetaMLAnn based on the public metagenomics dataset collected from multiple locations in the New York and Boston subway systems. The experimental results suggest that MetaMLAnn consistently performs better than five other conventional classifiers under different taxonomic ranks. At genus level, MetaMLAnn can achieve F1 scores of 0.63 and 0.72 on the New York and the Boston datasets, respectively. CONCLUSIONS: By exploiting heterogeneous features, MetaMLAnn captures the hidden interactions between microbial compositions and the urban environment, which enables precise predictions of microbial communities at unmeasured locations.

eScholarship - University of California

The Librating Companions in HD 37124, HD 12661, HD 82943, 47 Uma and GJ 876: Alignment or Antialignment?

Author: Ji Jianghui
Kinoshita H.
Li Guangyu
Liu Lin
Nakai H.
Zhou Jilin
Publication venue: 'University of Chicago Press'
Publication date: 23/05/2003

We investigated the apsidal motion of multi-planet systems. In the simulations, we found that the two planets of HD 37124, HD 12661, 47 Uma and HD 82943 separately undergo apsidal alignment or antialignment. But the companions of GJ 876 and υ And are only in apsidal lock about 0°. Moreover, we obtained the criteria with Laplace-Lagrange secular theory to discern whether a pair of planets for a certain system are in libration or circulation. Comment: 13 Pages, 3 figures, 2 tables, Published by ApJ Letters, 591, July 1, 2003 (Figures now included to match the publication)

arXiv.org e-Print Archive

Crossref

CERN Document Server

Near-Optimal MNL Bandits Under Risk Criteria

Author: Tao Chao
Xi Guangyu
Zhou Yuan
Publication date: 15/03/2021

We study MNL bandits, a variant of the traditional multi-armed bandit problem, under risk criteria. Unlike the ordinary expected revenue, risk criteria are more general goals widely used in industry and business. We design algorithms for a broad class of risk criteria, including but not limited to the well-known conditional value-at-risk, Sharpe ratio and entropy risk, and prove that they incur near-optimal regret. As a complement, we also conduct experiments with both synthetic and real data to show the empirical performance of our proposed algorithms. Comment: AAAI202
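
To make the contrast between expected revenue and a risk criterion concrete, here is a minimal Python sketch under stated assumptions. It uses the standard multinomial logit (MNL) choice model with a no-purchase option of weight 1 and a lower-tail definition of conditional value-at-risk (CVaR) for revenue; the preference weights `v`, revenues `r`, and the level `alpha` are hypothetical, and the paper's own definitions and algorithms may differ.

```python
def mnl_choice_probs(assortment, v):
    """MNL choice probabilities; key None is the no-purchase option (weight 1)."""
    denom = 1.0 + sum(v[i] for i in assortment)
    probs = {i: v[i] / denom for i in assortment}
    probs[None] = 1.0 / denom
    return probs

def revenue_distribution(assortment, v, r):
    """List of (revenue, probability) outcomes for one customer visit."""
    p = mnl_choice_probs(assortment, v)
    outcomes = [(r[i], p[i]) for i in assortment]
    outcomes.append((0.0, p[None]))  # no purchase yields zero revenue
    return outcomes

def expected_revenue(outcomes):
    return sum(rev * prob for rev, prob in outcomes)

def cvar_lower(outcomes, alpha):
    """Mean revenue over the worst alpha fraction of probability mass."""
    remaining = alpha
    total = 0.0
    for rev, prob in sorted(outcomes):  # lowest revenue first
        take = min(prob, remaining)
        total += take * rev
        remaining -= take
        if remaining <= 1e-12:
            break
    return total / alpha

# Hypothetical products: equal attractiveness, different revenues.
v = {"a": 1.0, "b": 1.0}
r = {"a": 10.0, "b": 4.0}

outcomes = revenue_distribution(["a", "b"], v, r)
mean_rev = expected_revenue(outcomes)   # risk-neutral objective
cvar_half = cvar_lower(outcomes, 0.5)   # risk-averse objective
```

With `alpha = 1` the lower-tail CVaR coincides with the expected revenue, so the risk-neutral objective is a special case of this criterion.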

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Risk awareness, self-efficacy, and social support predict secure smartphone usage

Author: Gan Yiqun
Gou Mengke
Schwarzer Ralf
Zhou Guangyu
Publication date: 01/01/2020

It is widely acknowledged that non-compliance with smartphone security behaviors is widespread and may cause severe harm to people and devices. In addition to device-based security issues, there are psychological factors involved in these behaviors such as self-efficacy, risk awareness, and social support. The present study examines associations of these three factors with smartphone security behaviors and explores possible mechanisms among these variables. In a longitudinal survey with 192 Chinese college students (73.4% women, mean age 24.46 years, SD = 5.15), self-efficacy, risk awareness, and social support were assessed with psychometric scales at two points in time, 2 weeks apart. Hierarchical regression analyses were performed with follow-up smartphone security behaviors as the dependent variable, controlling for baseline values and demographic and IT-related covariates. Main effects of self-efficacy, risk awareness, and social support on smartphone security behaviors were identified. Moreover, a synergistic triple interaction among the three predictors emerged, indicating that their combination yielded more favorable levels of secure smartphone use. The total model accounted for 50% of the behavioral variance, with all covariates included, and the triple interaction among self-efficacy, risk awareness, and social support accounted for 2.3% of variance. Results document that psychological factors are involved in smartphone security behaviors beyond demographic and IT-related covariates. Interventions could be designed to improve smartphone security behaviors not only by developing privacy-enhancing technologies but also by considering psychological factors such as self-efficacy, risk awareness, and social support.

Institutional Repository of the Freie Universität Berlin

Live head avatar using a single camera

Author: Cower Dillon
Enbom Per Niklas
Hefny Tarek
Zhou Guangyu
Publication venue: Technical Disclosure Commons
Publication date: 06/05/2019

This disclosure describes the generation of a photo-realistic, three-dimensional video of the head of a user using a single mobile device camera. The three-dimensional head video, referred to as an avatar, is live. The video closely resembles the user’s skin texture and mimics the user’s facial gestures and expressions in real time. The live head avatar is generated and utilized with the user’s permission.

Technical Disclosure Commons

Building Potent Chimeric Antigen Receptor T Cells With CRISPR Genome Editing

Author: Guangyu Zhou
Jie Liu
Li Zhang
Qi Zhao
Publication venue: 'Frontiers Media SA'
Publication date: 01/03/2019

Chimeric antigen receptor (CAR) T cells have shown great promise in the treatment of hematological and solid malignancies. However, despite the success of this field, there remain some major challenges, including accelerated T cell exhaustion, potential toxicities, and insertional oncogenesis. To overcome these limitations, recent advances in CRISPR technology have enabled targeted interventions in endogenous genes in human CAR T cells. These CRISPR genome editing approaches have unleashed the therapeutic potential of CAR T cell therapy. Here, we summarize the potential benefits, safety concerns, and difficulties in the generation of gene-edited CAR T cells using CRISPR technology.

Directory of Open Access Journals